NEUTrans: a Phrase-Based SMT System for CWMT2009
Abstract
In this report, we describe NEUTrans, the phrase-based statistical machine translation (SMT) system with which we (NEUNLPLab) participated in the news-domain Chinese-to-English single-system translation task of the 5th China Workshop on Machine Translation (CWMT2009). We submitted four translation results for this task. We first introduce the framework and the key techniques used in our system, then analyze the experimental results, and finally discuss the issues we found during the development of the system.
Similar resources
Phrase Reordering Model Integrating Syntactic Knowledge for SMT
Reordering model is important for the statistical machine translation (SMT). Current phrase-based SMT technologies are good at capturing local reordering but not global reordering. This paper introduces syntactic knowledge to improve global reordering capability of SMT system. Syntactic knowledge such as boundary words, POS information and dependencies is used to guide phrase reordering. Not on...
Chained System: A Linear Combination of Different Types of Statistical Machine Translation Systems
The paper explores a way to learn post-editing fixes of raw MT outputs automatically by combining two different types of statistical machine translation (SMT) systems in a linear fashion. Our proposed system (which we call a chained system) consists of two SMT systems: (i) a syntax-based SMT system and (ii) a phrase-based SMT system (Koehn, 2004). We first translate source sentences of the bite...
Literature Survey: Study of Reordering in Pivot Based SMT
Pivot Based SMT solves the problem of scarcity of source-target parallel corpus by introducing a third resource rich ‘pivot’ language. Triangulation method in Pivot Based SMT is a method that uses the pivot language to induce new phrase pairs into the phrase table, this process is known as ‘Phrase Table Triangulation’. Phrase Table Triangulation has been extensively studied by many researchers....
Connecting Phrase based Statistical Machine Translation Adaptation
Although more additional corpora are now available for Statistical Machine Translation (SMT), only the ones which belong to the same or similar domains with the original corpus can indeed enhance SMT performance directly. Most of the existing adaptation methods focus on sentence selection. In comparison, phrase is a smaller and more fine grained unit for data selection, therefore we propose a s...
Lexical Syntax for Statistical Machine Translation
Statistical Machine Translation (SMT) is by far the most dominant paradigm of Machine Translation. This can be justified by many reasons, such as accuracy, scalability, computational efficiency and fast adaptation to new languages and domains. However, current approaches of Phrase-based SMT lacks the capabilities of producing more grammatical translations and handling long-range reordering whil...